Supervised dimension reduction with topic models
Abstract
We consider supervised dimension reduction (SDR) for problems with discrete variables. Existing methods are computationally expensive and often ignore the local structure of the data when searching for a low-dimensional space. In this paper, we propose a novel framework for SDR that is (1) general and flexible, so that it can be easily adapted to various unsupervised topic models; (2) able to inherit the scalability of unsupervised topic models; and (3) able to exploit both label information and the local structure of the data when searching for a new space. Extensive experiments with adaptations to three models demonstrate that our framework can yield scalable, high-quality methods for SDR. One of these adaptations can perform better than the state-of-the-art method for SDR while running significantly faster.
Similar resources
An effective framework for supervised dimension reduction
We consider supervised dimension reduction (SDR) for problems with discrete inputs. Existing methods are computationally expensive, and often do not take the local structure of data into consideration when searching for a low-dimensional space. In this paper, we propose a novel framework for SDR with the aims that it can inherit scalability of existing unsupervised methods, and that it can expl...
LogisticLDA: Regularizing Latent Dirichlet Allocation by Logistic Regression
We present in this paper a supervised topic model for multi-class classification problems. To incorporate supervisory information, we jointly model documents and their labels in a graphical model called LogisticLDA, which mathematically integrates a generative model and a discriminative model in a principled way. By maximizing the posterior of document labels using logistic normal distributions...
Two models for Bayesian supervised dimension reduction
We study and develop two Bayesian frameworks for supervised dimension reduction that apply to nonlinear manifolds: Bayesian mixtures of inverse regressions and gradient based methods. Formal probabilistic models with likelihoods and priors are given for both methods and efficient posterior estimates of the effective dimension reduction space and predictive factors can be obtained by a Gibbs sam...
Analysis of Correlation Based Dimension Reduction Methods
Dimension reduction is an important topic in data mining and machine learning. Especially dimension reduction combined with feature fusion is an effective preprocessing step when the data are described by multiple feature sets. Canonical Correlation Analysis (CCA) and Discriminative Canonical Correlation Analysis (DCCA) are feature fusion methods based on correlation. However, they are differen...
Supervised Dimension Reduction Using Bayesian Mixture Modeling
We develop a Bayesian framework for supervised dimension reduction using a flexible nonparametric Bayesian mixture modeling approach. Our method retrieves the dimension reduction or d.r. subspace by utilizing a dependent Dirichlet process that allows for natural clustering for the data in terms of both the response and predictor variables. Formal probabilistic models with likelihoods and priors...